A Parameterizable FPGA Prototype of a Vector-Thread Processor
Authors
Abstract
The SCALE prototype board has been fabricated and consists of a single Xilinx XC2V4000 FPGA connected to a number of Micron DDR2 DRAMs that form the SCALE memory system. To support power measurements, the board is divided into multiple separate power islands. The board is attached to a test baseboard that provides sixteen independently measurable power supply connections and a byte-serial connection to a Linux PC that acts as the system host. The current version of the board has a variety of additional DRAM parts with independent power supplies to support experiments in DRAM power control. The final version of the SCALE board will replace these additional DRAMs with a socket for the SCALE chip.
Similar Resources
A Parameterizable Processor Architecture for Large Characteristic Pairing-Based Cryptography
Cryptographic pairing (bilinear mapping) is a core algorithm for various cryptography protocols. It is computationally expensive and inefficiently computed with general purpose processors. Although there has been previous work looking into efficient hardware designs for pairing, most of these systems use small characteristic curves which are incompatible with practical software designs. In this...
Simty: a Synthesizable General-Purpose SIMT Processor
Simty is a massively multi-threaded processor core that dynamically assembles SIMD instructions from scalar multi-thread code. It runs the RISC-V (RV32-I) instruction set. Unlike existing SIMD or SIMT processors like GPUs, Simty takes binaries compiled for general-purpose processors without any instruction set extension or compiler changes. Simty is described in synthesizable RTL. An FPGA prototy...
Simty: generalized SIMT execution on RISC-V
We present Simty, a massively multi-threaded RISC-V processor core that acts as a proof of concept for dynamic inter-thread vectorization at the micro-architecture level. Simty runs groups of scalar threads executing SPMD code in lockstep, and assembles SIMD instructions dynamically across threads. Unlike existing SIMD or SIMT processors like GPUs or vector processors, Simty vectorizes scalar g...
Rapid Prototyping of the Data-Driven Chip-Multiprocessor (d2-CMP) Using FPGAs
This paper presents the FPGA implementation of the prototype for the Data-Driven Chip-Multiprocessor (D2-CMP). In particular, we study the implementation of a Thread Synchronization Unit (TSU) on FPGA, a hardware unit that enables thread execution using a dataflow-like scheduling policy on a chip multiprocessor. Threads are scheduled for execution based on data availability, i.e., a thread is sch...
FPGA Prototyping of Manycore Multinode Systems for Irregular Applications
Knowledge discovery applications are an emerging class of irregular applications that exploit graph-based data structures, present poor locality and analyze very big data sets that require multi-node systems for processing. Current commodity clusters, which exploit cache-based processors, usually perform poorly with these applications. To address their requirements, full-custom machines, like th...